High-Dimensional Simplexes for Supermetric Search
Abstract
In 1953, Blumenthal showed that every semi-metric space that is isometrically embeddable in a Hilbert space has the n-point property; we have previously called such spaces supermetric spaces. Although this is a strictly stronger property than triangle inequality, it is nonetheless closely related and many useful metric spaces possess it. These include Euclidean, Cosine and Jensen-Shannon spaces of any dimension. A simple corollary of the n-point property is that, for any (n + 1) objects sampled from the space, there exists an n-dimensional simplex in Euclidean space whose edge lengths correspond to the distances among the objects. We show how the construction of such simplexes in higher dimensions can be used to give arbitrarily tight lower and upper bounds on distances within the original space. This allows the construction of an n-dimensional Euclidean space, from which lower and upper bounds of the original space can be calculated, and which is itself an indexable space with the n-point property. For similarity search, the engineering tradeoffs are good: we show significant reductions in data size and metric cost with little loss of accuracy, leading to a significant overall improvement in search performance.
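The construction described in the abstract can be illustrated with a minimal sketch. It assumes a set of n reference objects whose pairwise distances are known, places them as a base simplex in (n - 1) dimensions, and maps any further object to an apex in n dimensions whose last coordinate is its altitude above the base. The Cholesky-based layout, the function names (`base_simplex`, `apex`, `bounds`) and the Euclidean test data are assumptions made for illustration, not details taken from the abstract.

```python
import math
import random


def euclid(a, b):
    """Euclidean distance, standing in here for any supermetric."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def base_simplex(ref_dists):
    """Lay out n references in R^(n-1), with reference 0 at the origin.

    ref_dists is the n x n matrix of distances among the references.
    Returns the lower-triangular Cholesky factor L of their Gram matrix;
    row i of L holds the coordinates of reference i + 1.
    """
    m = len(ref_dists) - 1
    d0 = ref_dists[0]
    gram = [[(d0[i + 1] ** 2 + d0[j + 1] ** 2 - ref_dists[i + 1][j + 1] ** 2) / 2
             for j in range(m)] for i in range(m)]
    L = [[0.0] * m for _ in range(m)]
    for i in range(m):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(gram[i][i] - s)
            else:
                L[i][j] = (gram[i][j] - s) / L[j][j]
    return L


def apex(L, ref_dists, obj_dists):
    """Map an object to n coordinates from its distances to the n references."""
    d0sq = obj_dists[0] ** 2
    y = []
    for i in range(len(L)):
        # Dot product of the object with reference i + 1, from the cosine law.
        b = (d0sq + ref_dists[0][i + 1] ** 2 - obj_dists[i + 1] ** 2) / 2
        s = sum(L[i][k] * y[k] for k in range(i))
        y.append((b - s) / L[i][i])  # forward substitution
    # Altitude above the base; clamped at zero to absorb rounding error.
    h = math.sqrt(max(d0sq - sum(v * v for v in y), 0.0))
    return y + [h]


def bounds(a, b):
    """Lower and upper bounds on the original distance between two apexes."""
    base = sum((x - y) ** 2 for x, y in zip(a[:-1], b[:-1]))
    lower = math.sqrt(base + (a[-1] - b[-1]) ** 2)
    upper = math.sqrt(base + (a[-1] + b[-1]) ** 2)  # altitude of one apex mirrored
    return lower, upper


# Illustrative data: 4 references and 2 objects drawn from R^8.
random.seed(1)
refs = [[random.random() for _ in range(8)] for _ in range(4)]
D = [[euclid(r, s) for s in refs] for r in refs]
L = base_simplex(D)
p, q = ([random.random() for _ in range(8)] for _ in range(2))
ap = apex(L, D, [euclid(p, r) for r in refs])
aq = apex(L, D, [euclid(q, r) for r in refs])
print(bounds(ap, aq), euclid(p, q))
```

In this sketch each object is reduced to n numbers, and both bounds are computed from those numbers alone, which is where the reduction in data size and metric cost would come from. Adding more references is what would tighten the gap between the two bounds.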
Similar resources
Supermetric Search with the Four-Point Property
Metric indexing research is concerned with the efficient evaluation of queries in metric spaces. In general, a large space of objects is arranged in such a way that, when a further object is presented as a query, those objects most similar to the query can be efficiently found. Most such mechanisms rely upon the triangle inequality property of the metric governing the space. The triangle inequa...
An effective computation strategy for assessing operational flexibility of high-dimensional systems with complicated feasible regions
The volumetric flexibility index (FIv) of a chemical system can be viewed geometrically as the ratio between the hypervolume of the feasible region and that of a hypercube bounded by the expected upper and lower limits of uncertain process parameters. Although several methods have already been developed to compute FIv, none of them are effective for solving the high-dimensional problems defined i...
Supermetric Search
Metric search is concerned with the efficient evaluation of queries in metric spaces. In general, a large space of objects is arranged in such a way that, when a further object is presented as a query, those objects most similar to the query can be efficiently found. Most mechanisms rely upon the triangle inequality property of the metric governing the space. The triangle inequality property is...
Simplexes, Multi-Dimensional Scaling and Self-Organized Mapping
The self-organizing map (SOM) of Kohonen is one of the most successful models of unsupervised learning. Its popularity is partially due to the visualization topography preservation of relations among clusters in high-dimensional input space. SOM learns slowly, especially in the initial phase, and the preservation of topography by SOM maps is not based on any quantitative criteria. We have obta...
On the Join of Two Complexes
2. Definition of the join (K1, K2) of K1 and K2. To define the join of two complexes we first define the join (σ, τ) of a p-dimensional simplex σ and a q-dimensional simplex τ, p, q = 0, 1, ... . This join is a (p + q + 1)-dimensional simplex with a p-dimensional side associated with σ and the opposite side, which is q-dimensional, associated with τ. These sides will not be distinguished fro...